GIS, Big Data, and a Tweet Corpus Operationalized via Natural Language Processing
Abstract
While ad hoc, single-domain Big Data inquiry has been successful, observing multiple domains within a single GIS artifact needs further consideration. A GIS solution for multi-domain data analysis must provide visualization and explicit statistical analysis tools, e.g., regression over its constituent data streams, in order to enable large-scale dataset processing and evaluation. Such guidelines direct the inquiry into, and creation of, a robust GIS artifact that considers a social media tweet corpus and a domain-specific crime dataset. The tweet corpus is operationalized via natural language processing treatments and used in GIS artifact construction and evaluation. Although the results are not statistically significant and visualizing crime data is not novel, learning how to combine the two in predictive ways via GIS is. As such, extensions and possible future work support social media natural language processing techniques and Big Data processing for predictive crime-based incident interactions as front-run by real-time social media analysis.
Similar resources
Big Social Data and GIS: Visualize Predictive Crime
Social media is a desirable Big Data source used to examine the relationship between crime and social behavior. Observation of this connection is enriched within a geographic information system (GIS) rooted in environmental criminology theory, and produces several different results to substantiate such a claim. This paper presents the construction and implementation of a GIS artifact producing ...
Corpus based coreference resolution for Farsi text
"Coreference resolution" or "finding all expressions that refer to the same entity" in a text, is one of the important requirements in natural language processing. Two words are coreferent when both refer to a single entity in the text or the real world. So the main task of coreference resolution systems is to identify terms that refer to a unique entity. A coreference resolution tool could be...
Mining Geospatial Path Data from Natural Language Descriptions
In this paper, we describe the TEGUS system for mining geospatial path data from natural language descriptions. TEGUS uses natural language processing, GIS entity databases, and graph-based path finding to predict lat/lon paths based only on natural language text input. We also report on preliminary results from experiments on a corpus of path descriptions.
An architecture for Malay Tweet normalization
Research in natural language processing has increasingly focused on normalizing Twitter messages. Currently, while different well-defined approaches have been proposed for the English language, the problem remains far from being solved for other languages, such as Malay. Thus, in this paper, we propose an approach to normalize the Malay Twitter messages based on corpus-driven analysis. An archi...
Wildlife Damage Estimation and Prediction Using Blog and Tweet Information
Wildlife damage estimation and prediction using blog and tweet information is conducted. Through a regressive analysis with the truth data about wildlife damage which is acquired by the federal and provincial governments and the blog and the tweet information about wildlife damage which are acquired in the same year, it is found that some possibility for estimation and prediction of wildlife da...
Publication date: 2015